1. Introduction

Consider a triple $(Y, C, \mathbf{X})$ of random variables defined in
$\mathbb{R}^+ \times \mathbb{R}^+ \times \mathbb{R}^d$, $d \ge 2$, where $Y$ is the variable of
interest (typically a lifetime variable), $C$ a censoring variable and
$\mathbf{X} = (X_1, \ldots, X_d)$ a vector of concomitant variables. In most practical
applications, such as epidemiology or reliability, the relationship between
$Y$ and $\mathbf{X}$ is of particular interest. Denoting by $\psi$ a given measurable
function, we will focus here on the study of the conditional expectation of
$\psi(Y)$ given $\mathbf{X} = \mathbf{x}$,

$$m_\psi(\mathbf{x}) = \mathbb{E}(\psi(Y) \mid \mathbf{X} = \mathbf{x}), \quad \text{for all } \mathbf{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d. \tag{1.1}$$

The introduction of the function $\psi$ will allow us to treat simultaneously the
standard regression function and the conditional distribution function (see
Remark 2.4 below).

In the right censorship model, the pair $(Y, C)$ is not directly observed and
the corresponding information is given by $Z = \min\{Y, C\}$ and
$\delta = \mathbb{1}_{\{Y \le C\}}$, with $\mathbb{1}_E$ standing for the indicator function of the set $E$.
Therefore, we will assume that a sample
$\mathcal{D}_n = \{(Z_i, \delta_i, \mathbf{X}_i), i = 1, \ldots, n\}$ of independent and identically
distributed replicae of the triple $(Z, \delta, \mathbf{X})$ is at our disposal. In this
setting, transformations of the observed data $\mathcal{D}_n$ are usually needed to
estimate functionals of the conditional law of $Y$ (see, e.g., [4, 17, 28-30] and
the recent work of [35]). Estimators based on these transformations are usually
referred to as synthetic data estimators. In this paper, following the ideas
initiated by [28], we use a nonparametric version of particular synthetic data
estimators, commonly referred to as Inverse Probability of Censoring Weighted
[I.P.C.W.] estimators (see [3, 5] and [27] for some results related to
nonparametric I.P.C.W. estimates of the censored regression function). It is
however noteworthy that the methodology we propose here for I.P.C.W.-type
estimators applies, with minor modifications, to other synthetic data
estimators (see Paragraph 3.1 below).

A well-known issue in nonparametric estimation is the so-called curse of
dimensionality: the rate of convergence of nonparametric estimators generally
decreases as the dimensionality $d$ of the covariate increases. To get round this
problem, one solution is to work, if possible, under the additive model
assumption, which allows one to write the regression function as follows,

$$m_\psi(\mathbf{x}) = \mu + \sum_{\ell=1}^{d} m_{\psi,\ell}(x_\ell). \tag{1.2}$$
In (1.2), the real-valued functions $m_{\psi,\ell}$, $\ell = 1, \ldots, d$, are defined up to
an additive constant, and the assumption $\mathbb{E}\, m_{\psi,\ell}(X_\ell) = 0$,
$\ell = 1, \ldots, d$, is usually made to ensure identifiability. This assumption
implies $\mu = \mathbb{E}\psi(Y)$. In the uncensored case, several methods have been
proposed to estimate the additive regression function. We mention, among
others, the methods based on B-splines [36], on the backfitting algorithm
[21, 23, 32] and on marginal integration [34, 40, 31]. In [17], Fan and Gijbels
established the asymptotic normality of estimators obtained via the backfitting
algorithm combined with various synthetic data estimators. In [3], Brunel and
Comte considered additive models as special cases in their study of adaptive
projection I.P.C.W. estimators. Here, following the ideas introduced in [10], we
make use of the marginal integration method, coupled with initial kernel-type
I.P.C.W. estimators, to provide an estimator for the additive censored
regression function. This combination leads to estimators for which the theory
is easier to derive, a deliberate choice given the technicalities of the proofs
even in this simplified setting (note however that, as already mentioned,
extensions to other synthetic data estimators can be obtained; see Paragraph
3.1). In a previous work [10], the mean-square convergence rate was established
for the integrated estimator defined in (2.7) below. In the present paper, we
get the exact corresponding rate of strong uniform consistency (see Theorem 3.2
below). Our limit law corresponds to the extension of Theorem 2 in [9] to the
censored case. Moreover, following the ideas developed in [13], asymptotic
simultaneous 100% confidence bands are derived for the true regression
function. Such bands may complement the more classical
$(1 - \alpha) \times 100\%$ pointwise confidence intervals derived from CLT-type results
(see Section 4).

2. Hypotheses-Notations 

Before presenting our estimator and stating our results, we introduce some
notation as well as our working assumptions. First consider the hypotheses to be
made on the random triple $(Y, C, \mathbf{X})$. Introduce, for all $t \in \mathbb{R}$,
$F(t) = \mathbb{P}(Y > t)$, $G(t) = \mathbb{P}(C > t)$ and $H(t) = \mathbb{P}(Z > t)$, the right
continuous survival functions pertaining to $Y$, $C$ and $Z$ respectively.

(C.1) $C$ and $Y$ are independent and $\mathbb{P}(Y \le C \mid \mathbf{X}, Y) = \mathbb{P}(Y \le C \mid Y)$.
(C.2) $G$ is continuous.

(C.3) $m_\psi$ is $s$-times continuously differentiable, $s \ge 1$, and

$$\sup_{\mathbf{x}} \left| \frac{\partial^{s} m_\psi(\mathbf{x})}{\partial x_1^{s_1} \cdots \partial x_d^{s_d}} \right| < \infty, \quad s_1 + \cdots + s_d = s.$$

Remark 2.1. Assumption (C.3) allows one to control bias terms. Assumptions
(C.1) and (C.2) are essentially needed when using most synthetic data
estimators. (C.2) allows the use of convergence results for the Kaplan-Meier
[25] estimator of $G$. In addition, (C.1) especially allows one to derive the
result (2.1) below, which is a fundamental requirement for synthetic data. This
assumption was also used by Stute [38] in another context. It is however
noteworthy that Beran [2] (see also [8] and [12]) worked under the weaker
assumption of conditional independence between $Y$ and $C$ given $\mathbf{X}$ to derive
properties of a local version of the Kaplan-Meier estimator. On the other hand,
to use Beran's local Kaplan-Meier estimator the censoring has to be locally
fair, i.e., such that $\mathbb{P}(C > t \mid \mathbf{X}) > 0$ whenever $\mathbb{P}(Y > t \mid \mathbf{X}) > 0$. Here
(see assumption (A)(ii) below), we essentially suppose that $G(t) > 0$ whenever
$F(t) > 0$, which is in turn a weaker assumption. For a nice discussion on the
differences between the assumptions to be made when using either Beran's
estimator or I.P.C.W. type estimators, we refer to [5].

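Denote by $f$ [resp. $f_\ell$, $\ell = 1, \ldots, d$] the density of $\mathbf{X}$ [resp. $X_\ell$,
$\ell = 1, \ldots, d$]. Further let $C_1, \ldots, C_d$ be $d$ compact intervals of $\mathbb{R}$
with non empty interior, and set $\mathcal{C} = C_1 \times \cdots \times C_d$ the corresponding
product. For every subset $\mathcal{E}$ of $\mathbb{R}^\gamma$, $\gamma \ge 1$, and any $\alpha > 0$,
introduce the $\alpha$-neighborhood of $\mathcal{E}$,

$$\mathcal{E}^\alpha = \{x : \inf_{y \in \mathcal{E}} |x - y|_{\mathbb{R}^\gamma} \le \alpha\},$$

with $|\cdot|_{\mathbb{R}^\gamma}$ standing for the usual Euclidean norm on $\mathbb{R}^\gamma$. The
functions $f$ and $f_\ell$, $\ell = 1, \ldots, d$, will be supposed to be continuous, and
we will assume the existence of a constant $\alpha > 0$ such that the following
assumptions hold.

(C.4) $\forall x_\ell \in C_\ell^\alpha$, $f_\ell(x_\ell) > 0$, $\ell = 1, \ldots, d$, and $\forall \mathbf{x} \in \mathcal{C}^\alpha$, $f(\mathbf{x}) > 0$.
(C.5) $f$ is $s'$-times continuously differentiable on $\mathcal{C}^\alpha$, $s' > sd$.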

Remark 2.2. Assumption (C.4) is classical when dealing with kernel type
estimators of the regression function (see, e.g., [13, 15]). The fact that
$s' > sd$ in (C.5), when combined with (C.3) above and (K.1-2) and (H.4) below,
allows one to derive easily the results pertaining to the case where the
density function $f$ is unknown from the ones obtained in the simpler case where
this function is known. Some refinements in our proofs might allow for relaxing
(C.5) (see, e.g., [-1]).

Recalling (1.1), we will let $\psi$ vary in a pointwise measurable VC subgraph
class $\mathcal{F}$ of measurable real-valued functions defined on $\mathbb{R}$ (for the
definitions of pointwise measurable classes of functions and VC subgraph
classes of functions, we refer to p. 110 and Chapter 2 in [41]). We will also
assume that $\mathcal{F}$ has a measurable envelope function
$\Upsilon(y) \ge \sup_{\psi \in \mathcal{F}} |\psi(y)|$, $y \in \mathbb{R}$, such that

(C.6) $\Upsilon$ is uniformly bounded on $\mathbb{R}$.

Remark 2.3. In the uncensored setting, (C.6) can be replaced by some finiteness
condition on the moment of order 2 of $\Upsilon(Y)$ (see [I i] or [I-'i]). In the
censored setting however, such refinements are useless due to the assumption
(A) below.

Remark 2.4. Choices of particular interest for the class $\mathcal{F}$ are
$\mathcal{F}_{\mathrm{reg}} = \{I\}$, where $I$ denotes the identity function on $\mathbb{R}$, and
$\mathcal{F}_{\mathrm{cdf}} = \{\mathbb{1}_{(-\infty,t]}, t \in \mathbb{R}\}$. Considering the class
$\mathcal{F}_{\mathrm{reg}}$ allows one to treat the case of the classical regression function.
On the other hand, considering the class $\mathcal{F}_{\mathrm{cdf}}$ allows one to derive the
uniform consistency (especially over $t \in \mathbb{R}$) for estimates of the
conditional distribution function. We refer to [J '] for examples in the
uncensored case.

We will further employ sequences of positive constants $\{h_n\}_{n \ge 1}$ and
$\{h_{\ell,n}\}_{n \ge 1}$, $1 \le \ell \le d$, satisfying the following conditions.

(H.1) $h_n \downarrow 0$, $h_{\ell,n} \downarrow 0$, $n h_n^d \uparrow \infty$ and $n h_{\ell,n} \uparrow \infty$ as $n \to \infty$.

(H.2) $n h_n^d / \log n \to \infty$ and $n h_{\ell,n} / \log n \to \infty$ as $n \to \infty$.

(7J.3) nhi^n 11^=1 log'i^,"! ^ 0, for all si H h Sd = s, as n ^ oo. 

(H.4) $h_{\ell,n} \log n / (h_n^d\, |\log h_{\ell,n}|) \to 0$ as $n \to \infty$.

(H.5) $\log\log n / |\log h_n| \to 0$ and $\log\log n / |\log h_{\ell,n}| \to 0$ as $n \to \infty$.

Remark 2.5. Assumptions (H.1-2-5) are classical in empirical process theory,
and are often referred to as the Csörgő-Révész-Stute [CRS] conditions [7, 37].
They especially allow one to control variance-type terms. On the other hand,
assumption (H.3) allows one to control bias terms (see Lemma 5.8 below). As
already mentioned, assumption (H.4) allows one to derive easily the results
pertaining to the case where the density function $f$ is unknown from the ones
obtained in the simpler case where this function is known.

As mentioned in [19], functionals of the (conditional) law can generally not
be estimated on the complete support when the variable of interest is
right-censored. So, to state our results, we will work under the assumption
(A), which will be said to hold if either (A)(i) or (A)(ii) below holds. For
any right continuous survival function $L$ defined on $\mathbb{R}$, set
$T_L = \sup\{t \in \mathbb{R} : L(t) > 0\}$.

(A)(i) There exists an $\omega < T_H$ such that, for all $\psi \in \mathcal{F}$, $\psi = 0$ on $(\omega, \infty)$.

(A)(ii) (a) For a given $0 < p < 1/2$, $\int_0^{T_H} F^{-p/(1-p)}\, dG < \infty$;

(b) $T_F < T_G$;

(c) $n^{2p-1} h_{\ell,n}^{-1} |\log(h_{\ell,n})| \to \infty$, as $n \to \infty$, for every $\ell = 1, \ldots, d$.

It is noteworthy that the assumption (A)(ii) is needed in our proofs when
considering the estimation of the "classical" regression function, which
corresponds to the choice $\psi(y) = y$. On the other hand, rates of convergence
for estimators of functionals such as the conditional distribution function
$\mathbb{P}(Y \le t \mid \mathbf{X})$ can be obtained under weaker conditions, when restricting
ourselves to $t \in [0, \omega]$ with $\omega < T_H$.

These preliminaries being given, we can recall the procedure we proposed 
in [10] to estimate the censored regression function under the additive model 
assumption. Let $K$ be a bounded and compactly supported kernel on $\mathbb{R}^d$. By
kernel, we mean as usual a measurable function integrating to one on its support. 
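We define the kernel density estimator $f_n$ of $f$ by

$$f_n(\mathbf{x}) = \frac{1}{n h_n^d} \sum_{i=1}^{n} K\Big(\frac{\mathbf{x} - \mathbf{X}_i}{h_n}\Big).$$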

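Now, as was observed notably by Koul et al. [28], we have under (C.1),

$$\mathbb{E}\Big(\frac{\delta\, \psi(Z)}{G(Z)} \,\Big|\, \mathbf{X}\Big) = \mathbb{E}\Big(\frac{\psi(Y)}{G(Y)}\, \mathbb{E}\big(\mathbb{1}_{\{Y \le C\}} \mid \mathbf{X}, Y\big) \,\Big|\, \mathbf{X}\Big) = \mathbb{E}(\psi(Y) \mid \mathbf{X}). \tag{2.1}$$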
Then, denoting by $G^*$ the Kaplan-Meier [25] estimator of $G$, kernel-type
estimators of the multivariate regression function $m_\psi(\mathbf{x})$ defined in (1.1)
can be easily constructed [27]. Here, because marginal integration will further
be applied, the internal estimator idea of Jones [24] has to be used. That
leads us to consider the following multivariate I.P.C.W. kernel-type estimator
of the regression function.
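To make the two ingredients just mentioned concrete, the sketch below computes a Kaplan-Meier estimate of the censoring survival function $G$ and the resulting inverse-probability-of-censoring-weighted responses $\delta_i \psi(Z_i) / G^*(Z_i)$. It is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, ties among the observed times are assumed away, and the estimate is evaluated as a right-continuous step function.

```python
import numpy as np

def km_censoring_survival(z, delta):
    """Kaplan-Meier estimate of G(t) = P(C > t) at the sorted observed times.

    Censoring plays the role of the event, so jumps occur where delta == 0.
    Assumes no ties among the observed times (illustrative simplification).
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    order = np.argsort(z, kind="stable")
    z_sorted = z[order]
    censored = 1 - delta[order]            # 1 when C was observed (Y > C)
    at_risk = z.size - np.arange(z.size)   # n, n-1, ..., 1
    surv = np.cumprod(1.0 - censored / at_risk)
    return z_sorted, surv

def evaluate_step(z_sorted, surv, t):
    """Evaluate the estimate at times t as a right-continuous step function."""
    idx = np.searchsorted(z_sorted, t, side="right") - 1
    return np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)

def ipcw_responses(z, delta, psi):
    """Synthetic responses delta_i * psi(Z_i) / G*(Z_i); censored ones are 0."""
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    z_sorted, surv = km_censoring_survival(z, delta)
    g_hat = evaluate_step(z_sorted, surv, z)
    # Censored points contribute 0, so their denominator is irrelevant;
    # replacing it by 1 avoids a 0/0 at the largest censored observation.
    safe = np.where(delta == 1, g_hat, 1.0)
    return delta * psi(z) / safe
```

On a toy sample with observed times 1, 2, 3, 4 and indicators 1, 0, 1, 0, the uncensored observation at time 3 is inflated by the factor 3/2, which compensates for the observation censored at time 2.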

Here the kernel functions $K_\ell$, $\ell = 1, \ldots, d$, defined in $\mathbb{R}$ are supposed
to be continuous, of bounded variation (i.e. such that
$0 < \int_{\mathbb{R}} |dK_\ell(t)| < \infty$) and compactly supported. Recalling that a kernel
function $\Gamma$ defined in $\mathbb{R}^d$ is said to be of order $\gamma$, for any
$\gamma \ge 1$, whenever (a) and (b) below hold jointly,

(a) $\int_{\mathbb{R}^d} u_1^{j_1} \cdots u_d^{j_d}\, \Gamma(\mathbf{u})\, d\mathbf{u} = 0$, $j_1, \ldots, j_d \ge 0$, $j_1 + \cdots + j_d = 1, \ldots, \gamma - 1$;
(b) $\int_{\mathbb{R}^d} |u_1^{j_1} \cdots u_d^{j_d}|\, |\Gamma(\mathbf{u})|\, d\mathbf{u} < \infty$, $j_1, \ldots, j_d \ge 0$, $j_1 + \cdots + j_d = \gamma$;

we will also impose the conditions (K.1-2).

(K.1) $\widetilde{K} := \prod_{\ell=1}^{d} K_\ell$ is of order $s$.
(K.2) $K$ is of order $s'$.
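The order conditions (a) and (b) can be checked numerically for a given univariate kernel. The sketch below does so for the Epanechnikov kernel, using Gauss-Legendre quadrature on $[-1, 1]$ (exact here because the integrands are polynomials). The helper names are hypothetical, and `kernel_order` returns the largest $\gamma$ for which the vanishing-moment condition (a) holds, which is one common reading of "the order" of a kernel.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def kernel_moment(kernel, j, n_nodes=20):
    """Integral of u^j * kernel(u) over [-1, 1] by Gauss-Legendre quadrature."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    return float(np.sum(weights * nodes ** j * kernel(nodes)))

def kernel_order(kernel, max_order=6, tol=1e-10):
    """Smallest j >= 1 with a non-zero j-th moment (None if none up to max_order).

    Moments of order 1, ..., j - 1 then vanish, so condition (a) holds
    with gamma = j and fails for any larger gamma.
    """
    for j in range(1, max_order + 1):
        if abs(kernel_moment(kernel, j)) > tol:
            return j
    return None
```

For the Epanechnikov kernel the zeroth moment is 1, the first moment vanishes by symmetry and the second moment equals 0.2, so the kernel is of order 2 in the above sense.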

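In order to apply the marginal integration method (see [31, 34]), introduce
$q_1, \ldots, q_d$, $d$ given density functions defined in $\mathbb{R}$. Further set, for all
$\mathbf{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d$, $q(\mathbf{x}) = \prod_{\ell=1}^{d} q_\ell(x_\ell)$ and, for every
$\ell = 1, \ldots, d$, $q_{-\ell}(\mathbf{x}_{-\ell}) = \prod_{j \ne \ell} q_j(x_j)$ with
$\mathbf{x}_{-\ell} = (x_1, \ldots, x_{\ell-1}, x_{\ell+1}, \ldots, x_d)$. Now, we can define

$$\eta_{\psi,\ell}(x_\ell) = \int_{\mathbb{R}^{d-1}} m_\psi(\mathbf{x})\, q_{-\ell}(\mathbf{x}_{-\ell})\, d\mathbf{x}_{-\ell} - \int_{\mathbb{R}^d} m_\psi(\mathbf{x})\, q(\mathbf{x})\, d\mathbf{x}, \quad \ell = 1, \ldots, d, \tag{2.3}$$

in such a way that, recalling (1.2), the two following equalities hold,

$$\eta_{\psi,\ell}(x_\ell) = m_{\psi,\ell}(x_\ell) - \int_{\mathbb{R}} m_{\psi,\ell}(u)\, q_\ell(u)\, du, \quad \ell = 1, \ldots, d, \tag{2.4}$$

$$m_\psi(\mathbf{x}) = \sum_{\ell=1}^{d} \eta_{\psi,\ell}(x_\ell) + \int_{\mathbb{R}^d} m_\psi(\mathbf{u})\, q(\mathbf{u})\, d\mathbf{u}. \tag{2.5}$$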
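The identities (2.4) and (2.5) can be verified numerically on a toy additive model. The sketch below takes $d = 2$, an arbitrary additive function $m(x_1, x_2) = \mu + m_1(x_1) + m_2(x_2)$ and uniform integration densities $q_1 = q_2$ on $[-1, 1]$. All of these choices, as well as the function names, are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

# Gauss-Legendre rule on [-1, 1]; halving the weights integrates against
# the uniform density q_1 = q_2 = 0.5 on [-1, 1] (illustrative choice).
NODES, WEIGHTS = np.polynomial.legendre.leggauss(40)
Q_WEIGHTS = 0.5 * WEIGHTS

MU = 0.3  # hypothetical constant term of the additive model

def m1(x):
    return np.sin(x)

def m2(x):
    return x ** 2

def m(x1, x2):
    """Toy additive regression function, as in (1.2) with d = 2."""
    return MU + m1(x1) + m2(x2)

def full_integral():
    """Integral of m against q = q_1 * q_2 over the whole square."""
    grid = m(NODES[:, None], NODES[None, :])
    return float(Q_WEIGHTS @ grid @ Q_WEIGHTS)

def eta(ell, x):
    """Marginal-integration component (2.3) for the toy model."""
    if ell == 1:
        partial = float(np.sum(Q_WEIGHTS * m(x, NODES)))  # integrate out x_2
    else:
        partial = float(np.sum(Q_WEIGHTS * m(NODES, x)))  # integrate out x_1
    return partial - full_integral()
```

With these choices the integral of $m_1 q_1$ is 0 and that of $m_2 q_2$ is 1/3, so `eta(1, x)` should return $\sin(x)$ and `eta(2, x)` should return $x^2 - 1/3$, in agreement with (2.4).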

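In view of (2.4) and (2.5), for every $\ell = 1, \ldots, d$, $\eta_{\psi,\ell}$ and
$m_{\psi,\ell}$ are equal up to an additive constant, so that the functions
$\eta_{\psi,\ell}$ are actually some additive components, which coincide with
$m_{\psi,\ell}$ for the choice $q_\ell = f_\ell$ (which is only achievable if $f_\ell$ is
known). From (2.2) and (2.3), a natural estimator of the $\ell$-th component
$\eta_{\psi,\ell}$ is given, for $\ell = 1, \ldots, d$, by

$$\widehat{\eta}^{*}_{\psi,\ell}(x_\ell) = \int_{\mathbb{R}^{d-1}} \widehat{m}^{*}_{\psi,n}(\mathbf{x})\, q_{-\ell}(\mathbf{x}_{-\ell})\, d\mathbf{x}_{-\ell} - \int_{\mathbb{R}^d} \widehat{m}^{*}_{\psi,n}(\mathbf{x})\, q(\mathbf{x})\, d\mathbf{x}, \tag{2.6}$$

where $\widehat{m}^{*}_{\psi,n}$ denotes the estimator defined in (2.2). From (2.5) and
(2.6), an estimator $\widehat{m}^{*}_{\psi,\mathrm{add}}$ of the additive regression function can
be deduced,

$$\widehat{m}^{*}_{\psi,\mathrm{add}}(\mathbf{x}) = \sum_{\ell=1}^{d} \widehat{\eta}^{*}_{\psi,\ell}(x_\ell) + \int_{\mathbb{R}^d} \widehat{m}^{*}_{\psi,n}(\mathbf{u})\, q(\mathbf{u})\, d\mathbf{u}. \tag{2.7}$$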



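In the sequel, we will assume that the known integration density function
$q_\ell$ has a compact support included in $C_\ell$, $\ell = 1, \ldots, d$. Moreover, we
will impose the following assumption on the functions $q_{-\ell}$,
$\ell = 1, \ldots, d$.

(Q.1) $q_{-\ell}$ is a bounded and $s$-times differentiable function such that

$$\sup_{\mathbf{x}_{-\ell}} \left| \frac{\partial^{s} q_{-\ell}(\mathbf{x}_{-\ell})}{\partial x_1^{i_1} \cdots \partial x_d^{i_d}} \right| < \infty, \quad i_1 + \cdots + i_d = s, \quad \ell = 1, \ldots, d.$$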



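Before stating our main results, some additional notations are needed. For all
$\psi \in \mathcal{F}$, all $\mathbf{u} = (u_1, \ldots, u_d) \in \mathcal{C}$ and every $\ell = 1, \ldots, d$, set

$$H_\psi(\mathbf{u}) = \mathbb{E}\Big(\frac{\psi^2(Y)}{G(Y)} \,\Big|\, \mathbf{X} = \mathbf{u}\Big) \tag{2.8}$$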

and (f>4,,i{ui) / g^(u) q_^(^^j^_^^(i^_^_ (2.9) 
Further set, for all G and every £ — 1, . . . ,d, 

a^^i= sup W%4^^ [ Kf, ae=swpa^,e and (T = 'S^ai. (2.10) 



=1 



3. Main results 

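We have now all the ingredients to state our results. From now on,
$\xrightarrow{a.s.}$ will stand for almost sure convergence. Theorem 3.1 below describes
the asymptotic behavior of the additive component estimates defined in (2.6),
denoted here by $\widehat{\eta}^{*}_{\psi,\ell}$, $\ell = 1, \ldots, d$.

Theorem 3.1. Under the hypotheses (A), (C.1-2-3-4-5-6), (H.1-2-3-4-5), (K.1-2)
and (Q.1), we have, for $\ell = 1, \ldots, d$,

$$\sqrt{\frac{n h_{\ell,n}}{2 |\log h_{\ell,n}|}}\; \sup_{\psi \in \mathcal{F}}\, \sup_{x_\ell \in C_\ell} \pm\{\widehat{\eta}^{*}_{\psi,\ell}(x_\ell) - \eta_{\psi,\ell}(x_\ell)\} \xrightarrow{a.s.} \sigma_\ell \quad \text{as } n \to \infty, \tag{3.1}$$

where $\sigma_\ell$ is as in (2.10).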

From Theorem 3.1, we will deduce an analogous result for the additive re- 
gression function estimator defined in (2.7). 

Theorem 3.2. Assume the hypotheses of Theorem 3.1 hold. If, in addition,
$h_{\ell,n} = h_{1,n}$ for every $\ell = 1, \ldots, d$, then we have, for the estimator
$\widehat{m}^{*}_{\psi,\mathrm{add}}$ defined in (2.7),

$$\sqrt{\frac{n h_{1,n}}{2 |\log h_{1,n}|}}\; \sup_{\psi \in \mathcal{F}}\, \sup_{\mathbf{x} \in \mathcal{C}} \pm\{\widehat{m}^{*}_{\psi,\mathrm{add}}(\mathbf{x}) - m_\psi(\mathbf{x})\} \xrightarrow{a.s.} \sigma \quad \text{as } n \to \infty, \tag{3.2}$$

where $\sigma$ is as in (2.10).

Keep in mind that a similar result is readily obtained for the conditional
distribution function by selecting $\mathcal{F} = \{\mathbb{1}_{[0,t]}, t \in \mathbb{R}^+\}$ in Theorem 3.2.

The proofs of Theorems 3.1 and 3.2 are postponed to Section 5. A sketch
of the proof of Theorem 3.1 is as follows. It will be split into two main parts.
First we will assume that both the survival function $G$ of $C$ and the density
function $f$ of $\mathbf{X}$ are known. Then, using appropriate approximation lemmas,
we will show how to treat the general case (i.e. the case where neither $f$ nor
$G$ is known). To establish the results in the case where both $G$ and $f$ are
known, we will mostly borrow the arguments developed in [13] and [15] (see also
[10]), which rest on recent developments in empirical process theory, and
especially on an exponential bound due to Talagrand [39] (see also Inequality
A.1 in the Appendix).

In the following Paragraph 3.1, we show how our results may be extended to 
the case of more general synthetic data. In Section 4 we present an application 
of our results, following the ideas developed in [13]. 



3.1. Extensions



Here, we will limit ourselves to the case $\mathcal{F} = \{I\}$, where $I$ stands for the
identity function on $\mathbb{R}$. The corresponding estimator defined in (2.2), and then
the one defined under the additive assumption in (2.7), rest on the following
transformation, which is due to Koul et al. [28]: for $1 \le i \le n$,

$$(\delta_i, Z_i) \to \frac{\delta_i Z_i}{G^*(Z_i)}, \tag{3.3}$$

which, in the case where $G$ is known, reduces to

$$(\delta_i, Z_i) \to \frac{\delta_i Z_i}{G(Z_i)}. \tag{3.4}$$

Note that (3.4) sets a censored observation to 0 and multiplies an uncensored
observation by a factor $[G(Z_i)]^{-1}$, which can be very large if $G(Z_i)$ is near
0. Alternative, and more general, synthetic data can be constructed in the
following way. For any given $\rho \in \mathbb{R}$, set

$$\Theta_1(z) = (1 + \rho) \int_0^z \frac{dt}{G(t)} - \frac{\rho z}{G(z)}, \qquad \Theta_2(z) = (1 + \rho) \int_0^z \frac{dt}{G(t)}, \tag{3.5}$$

with $\rho$ chosen such that $\Theta_1(Z) > 0$ almost surely. Now, consider the
transformation, for $1 \le i \le n$,

$$(\delta_i, Z_i) \to \widetilde{Y}_i = \delta_i \Theta_1(Z_i) + (1 - \delta_i) \Theta_2(Z_i). \tag{3.6}$$

Observe that (3.4) corresponds to the particular choice $\rho = -1$. The choice
$\rho = 0$ is also popular, and was first considered in [30]. Other choices
(including some data-dependent choices) are discussed in [17].
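The unbiasedness property that makes these transformations usable can be checked by simulation when $G$ is known. The sketch below assumes, purely for illustration, exponential censoring (so that the integral of $1/G$ has a closed form) and $Y$ uniform on $(0, 1)$; the rate, the sample size, the seed and the function names are all hypothetical choices, not taken from the paper.

```python
import numpy as np

RATE = 1.0  # assumed censoring law: C ~ Exp(RATE), so G(t) = exp(-RATE * t)

def G(t):
    """Known censoring survival function (illustrative assumption)."""
    return np.exp(-RATE * t)

def int_inv_G(z):
    """Closed form of the integral of 1/G over [0, z] for exponential censoring."""
    return (np.exp(RATE * z) - 1.0) / RATE

def synthetic(z, delta, rho):
    """Synthetic data delta * Theta_1(Z) + (1 - delta) * Theta_2(Z)."""
    theta2 = (1.0 + rho) * int_inv_G(z)
    theta1 = theta2 - rho * z / G(z)
    return delta * theta1 + (1 - delta) * theta2

def simulate_mean(rho, n=200_000, seed=0):
    """Monte Carlo mean of the synthetic responses when Y ~ U(0, 1)."""
    rng = np.random.default_rng(seed)
    y = rng.uniform(0.0, 1.0, size=n)
    c = rng.exponential(1.0 / RATE, size=n)  # NumPy takes the scale 1 / RATE
    z = np.minimum(y, c)
    delta = (y <= c).astype(int)
    return float(np.mean(synthetic(z, delta, rho)))
```

For every value of the tuning parameter the Monte Carlo mean should be close to the mean of $Y$, here 0.5; with the parameter set to $-1$ the function `synthetic` reduces to the transformation of Koul et al. with known $G$.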
Consider the following independence assumption.
(C.1') $C$ and $(Y, \mathbf{X})$ are independent.

Observe that (C.1') naturally implies (C.1). Under the assumption (C.1'), we
have (see, e.g., [26]), writing $\widetilde{Y}_i$ for the synthetic response built in (3.6),

$$\mathbb{E}(\widetilde{Y}_i \mid \mathbf{X}) = \mathbb{E}(Y_i \mid \mathbf{X}).$$

A close look into the proof presented in Section 5 below reveals that, in the
case where $\mathcal{F} = \{I\}$, Theorems 3.1 and 3.2 still hold when considering
estimators built on the "general" synthetic data, under the assumption (C.1').
The only difference is the term $H_\psi(\mathbf{u}) = H_I(\mathbf{u})$ (since $\psi = I$) defined in
(2.8), which is replaced in the general case by

$$\widetilde{H}_I(\mathbf{u}) = \mathbb{E}(\widetilde{Y}^2 \mid \mathbf{X} = \mathbf{u}),$$

where $\widetilde{Y}$ denotes the synthetic response (3.6) computed from $(\delta, Z)$.



4. Application 



Following the ideas developed in [13], we now present a practical application 
of Theorem 3.1. Recall the definition (2.9) of the functions 4>t(,.e- Then, for any 
fixed Tp e J^, and every £ = 1, . . . , d, let T^^e,nixe) be a consistent estimator of 
T^A^i)^ with T^A^i) = \/4'^,i{^t)/ fnixi). For instance, set 



T^,l,n{xi) 



1 



Further set 



Ln{xi) 



2 1 log /if 



./n(x) 



X T^,l.,n{xi) 



g_^(x_f)dx_^. 







1/2 


|l/2 








.Jr. 





(4.1) 



In view of Theorem 3.1, it is straightforward that, for each
$0 < \varepsilon < 1$, there exists almost surely an $n_0 = n_0(\varepsilon)$ such that, for all
$n \ge n_0$, with $\widehat{\eta}^{*}_{\psi,\ell}$ denoting the estimator defined in (2.6),

$$\eta_{\psi,\ell}(x_\ell) \in \big[\widehat{\eta}^{*}_{\psi,\ell}(x_\ell) \pm (1 + \varepsilon) L_n(x_\ell)\big], \quad \text{uniformly over } x_\ell \in C_\ell,$$
$$\eta_{\psi,\ell}(x_\ell) \notin \big[\widehat{\eta}^{*}_{\psi,\ell}(x_\ell) \pm (1 - \varepsilon) L_n(x_\ell)\big], \quad \text{for some } x_\ell \in C_\ell.$$

Therefore, under the assumptions of Theorem 3.1, the interval

$$[A_{n,\ell}(x_\ell), B_{n,\ell}(x_\ell)] := \big[\widehat{\eta}^{*}_{\psi,\ell}(x_\ell) - L_n(x_\ell),\; \widehat{\eta}^{*}_{\psi,\ell}(x_\ell) + L_n(x_\ell)\big], \tag{4.2}$$

provides asymptotic simultaneous confidence bands (at an asymptotic confidence
level of 100%) for $\eta_{\psi,\ell}(x_\ell)$ over $x_\ell \in C_\ell$ (see [13] for more details).
It is noteworthy that our bands do not provide confidence regions in the usual
sense, since they are not based on a specified confidence level $1 - \alpha$.
Instead, they hold with probability tending to 1 as $n \to \infty$, and are then more
conservative (since they are simultaneous and with an asymptotic level of
100%). A comparison between pointwise $(1 - \alpha) \times 100\%$ confidence intervals
and our simultaneous almost certainty bands can be found in [11]. In most
applications, we recommend the construction of both types of confidence region
to assess the form of the relationship between $\psi(Y)$ and $\mathbf{X}$.

Remark 4.1. For finite sample size use, Deheuvels and Mason [13] give some
recommendations on how to ensure that the simultaneous (almost certainty)
confidence bands defined in (4.2) include the pointwise confidence intervals.
See Remark 1.7 (pp. 233-235) in [13] for more details.

Illustration: a simple simulation study 

In this paragraph, we present some results from a simulation study. We worked
with a sample size $n = 1000$, and considered the case where
$\mathbf{X} = (X_1, X_2) \in \mathbb{R}^2$ (i.e. $d = 2$) was such that $X_1 \sim \mathcal{U}(-1, 1)$ and
$X_2 \sim \mathcal{U}(-1, 1)$, where $\mathcal{U}(a, b)$ stands for the uniform law on $(a, b)$. Set
$m_1(x) = 0.5 \times \cos^2(x)$ and $m_2(x) = 0.5 \times \sin^2(x)$. We selected
$\psi = \mathbb{1}_{\{\cdot \le 0.9\}}$ and considered the model
$\mathbb{E}[\psi(Y) \mid X_1 = x_1, X_2 = x_2] = m_1(x_1) + m_2(x_2)$. Under this model, the
variable $Y$ was simulated as follows. For each integer $1 \le i \le n$, let
$p_i = m_1(x_{1,i}) + m_2(x_{2,i})$ where $x_{j,i}$ is the $i$-th observed value of the
variable $X_j$, $j = 1, 2$. Note that $0 \le p_i \le 1$ for every $1 \le i \le n$. Each
$Y_i$ was then generated as one $\mathcal{U}(0.9 - p_i, 1 + 0.9 - p_i)$ variable. Following
this procedure ensured that
$\mathbb{P}(Y_i \le 0.9 \mid \mathbf{X}_i = \mathbf{x}_i) = p_i = m_1(x_{1,i}) + m_2(x_{2,i})$. Regarding the
censoring variable, we generated an i.i.d. sample $C_1, \ldots, C_n$ such that
$C_i \sim \mathcal{U}(0, 1)$. This choice yielded, a posteriori,
$\mathbb{P}(\delta = 1) \simeq 0.2$. We used Epanechnikov kernels (for $K$, $K_1$ and $K_2$) and
selected $q_1 = q_2 = 0.5 \times \mathbb{1}_{[-1,1]}$ (in such a way that the additive
components to estimate were $\eta_{\psi,j} = m_j - 0.25$, $j = 1, 2$). As for the
bandwidth choice, we opted a priori for
$h_{1000} = h_{1,1000} = h_{2,1000} = 0.1$. Results are presented in Figure 1.
The confidence bands appear to be adequate, in the sense that they contain
the true value of the additive component for "almost" every $x \in [-1, 1]$. The
fact that the true function does not belong to our bands for some points was
expected: it is due to the $\varepsilon$ term in (4.1). In addition, the boundary effect
pertaining to kernel estimators is perceptible on the plots of Figure 1. In view
of the assumption (C.4), we shall however recall that our theorems do not allow
one to build confidence bands on the entire $[-1, 1]$, and the plots should only
be considered on, typically, $[-0.9, 0.9]$.
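The data-generating mechanism of this illustration can be sketched in a few lines. The code below is not the authors' simulation code: it reproduces only the sampling of the covariates, the responses and the censoring times as described above, and the function name, the seed and the returned layout are hypothetical.

```python
import numpy as np

def m1(x):
    return 0.5 * np.cos(x) ** 2

def m2(x):
    return 0.5 * np.sin(x) ** 2

def simulate(n=1000, seed=0):
    """Draw (X, Y, Z, delta, p) following the design described in the text."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n, 2))       # X_1, X_2 ~ U(-1, 1)
    p = m1(x[:, 0]) + m2(x[:, 1])                 # P(Y <= 0.9 | X) by design
    y = rng.uniform(0.9 - p, 1.9 - p)             # Y_i ~ U(0.9 - p_i, 1.9 - p_i)
    c = rng.uniform(0.0, 1.0, size=n)             # censoring C_i ~ U(0, 1)
    z = np.minimum(y, c)                          # observed time
    delta = (y <= c).astype(int)                  # 1 when Y is observed
    return x, y, z, delta, p
```

Averaging `delta` over a large simulated sample gives a proportion of uncensored observations close to the value of about 0.2 reported above, and the empirical frequency of the event $Y \le 0.9$ matches the average of the $p_i$.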

5. Proofs 

5.1. Proof of Theorem 3.1 

Only the proof for the first component is provided. The proof for the $d - 1$
remaining components follows from similar arguments and is therefore omitted.

Fig 1. Results of the simulation study: true additive components (solid line),
their estimates (red dashed line), and the associated confidence bands (dotted
line).

As already mentioned, we first consider the case where both the survival
function $G$ of $C$ and the density function $f$ of $\mathbf{X}$ are known.

5.1.1. The case where both f and G are known 

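Recall the definitions (2.2) and (2.6) and let $\widehat{m}_{\psi,n}$ [resp.
$\widehat{\eta}_{\psi,1}$] be the version of the estimator (2.2) [resp. of the estimator
(2.6) with $\ell = 1$] in the case where both $G$ and $f$ are known. Namely we
have

$$\widehat{\eta}_{\psi,1}(x_1) = \int_{\mathbb{R}^{d-1}} \widehat{m}_{\psi,n}(\mathbf{x})\, q_{-1}(\mathbf{x}_{-1})\, d\mathbf{x}_{-1} - \int_{\mathbb{R}^d} \widehat{m}_{\psi,n}(\mathbf{x})\, q(\mathbf{x})\, d\mathbf{x}. \tag{5.2}$$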

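In this paragraph, we intend to prove the following result, which is the version
of Theorem 3.1 in the case where both $f$ and $G$ are known.

Proposition 5.1. Under the hypotheses of Theorem 3.1, we have, with
$\widehat{\eta}_{\psi,1}$ as in (5.2),

$$\sqrt{\frac{n h_{1,n}}{2 |\log h_{1,n}|}}\; \sup_{\psi \in \mathcal{F}}\, \sup_{x_1 \in C_1} \pm\{\widehat{\eta}_{\psi,1}(x_1) - \eta_{\psi,1}(x_1)\} \xrightarrow{a.s.} \sigma_1, \quad \text{as } n \to \infty, \tag{5.3}$$

where $\sigma_1$ is as in (2.10).

In a first step, we will establish Lemma 5.1 below.

Lemma 5.1. Under the assumptions of Theorem 3.1, we have

$$\sqrt{\frac{n h_{1,n}}{2 |\log h_{1,n}|}}\; \sup_{\psi \in \mathcal{F}}\, \sup_{x_1 \in C_1} \pm\{\widehat{\eta}_{\psi,1}(x_1) - \mathbb{E}\,\widehat{\eta}_{\psi,1}(x_1)\} \xrightarrow{a.s.} \sigma_1, \quad \text{as } n \to \infty,$$

where $\sigma_1$ is as in (2.10).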

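Let $\psi \in \mathcal{F}$ be a fixed real-valued, measurable and uniformly bounded
function defined in $\mathbb{R}$. Following the ideas developed in [15], we will first
establish Lemma 5.2 below, which corresponds to Lemma 5.1 in the case where
$\mathcal{F}$ is reduced to $\{\psi\}$. Then, we will show how to handle the uniformity
over the whole class $\mathcal{F}$ (see Lemmas 5.6 and 5.7 in the sequel).

Lemma 5.2. Under the assumptions of Theorem 3.1, we have, with
$\widehat{\eta}_{\psi,1}$ as in (5.2),

$$\sqrt{\frac{n h_{1,n}}{2 |\log h_{1,n}|}}\; \sup_{x_1 \in C_1} \pm\{\widehat{\eta}_{\psi,1}(x_1) - \mathbb{E}\,\widehat{\eta}_{\psi,1}(x_1)\} \xrightarrow{a.s.} \sigma_{\psi,1}, \quad \text{as } n \to \infty,$$

where $\sigma_{\psi,1}$ is as in (2.10).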

The proof of Lemma 5.2 is built on recent developments in empirical process
theory (see, e.g., [13, 15, 10]). Denote by $\alpha_n$ the multivariate empirical
process based upon $(\mathbf{X}_1, Z_1, \delta_1), \ldots, (\mathbf{X}_n, Z_n, \delta_n)$ and indexed by a
class $\mathcal{G}$ of measurable functions defined on $\mathbb{R}^{d+2}$. More formally, for
$g \in \mathcal{G}$, $\alpha_n(g)$ is defined by

$$\alpha_n(g) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \big(g(\mathbf{X}_i, Z_i, \delta_i) - \mathbb{E}\, g(\mathbf{X}_i, Z_i, \delta_i)\big). \tag{5.4}$$



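For $\mathbf{X}_i = (X_{i,1}, \ldots, X_{i,d})$, $1 \le i \le n$, and $x_1 \in C_1$, set

$$g^{x_1}_{\psi,n}(\mathbf{X}_i, Z_i, \delta_i) = \frac{\delta_i\, \psi(Z_i)}{G(Z_i)}\, T_n(\mathbf{X}_i)\, K_1\Big(\frac{x_1 - X_{i,1}}{h_{1,n}}\Big), \tag{5.5}$$

$$\text{with } T_n(\mathbf{X}_i) = \frac{1}{f(\mathbf{X}_i)} \int_{\mathbb{R}^{d-1}} \prod_{\ell=2}^{d} \frac{1}{h_{\ell,n}} K_\ell\Big(\frac{x_\ell - X_{i,\ell}}{h_{\ell,n}}\Big)\, q_{-1}(\mathbf{x}_{-1})\, d\mathbf{x}_{-1}. \tag{5.6}$$

From (5.1), (5.2), (5.4) and (5.5), we successively get the two following
equalities, where $\widehat{m}_{\psi,n}$ and $\widehat{\eta}_{\psi,1}$ denote the versions of the
estimators (2.2) and (2.6) with known $f$ and $G$,

$$\sqrt{n}\, \alpha_n(g^{x_1}_{\psi,n}) = n h_{1,n} \int_{\mathbb{R}^{d-1}} \{\widehat{m}_{\psi,n}(\mathbf{x}) - \mathbb{E}\,\widehat{m}_{\psi,n}(\mathbf{x})\}\, q_{-1}(\mathbf{x}_{-1})\, d\mathbf{x}_{-1}, \tag{5.7}$$

$$n h_{1,n} \{\widehat{\eta}_{\psi,1}(x_1) - \mathbb{E}\,\widehat{\eta}_{\psi,1}(x_1)\} = \sqrt{n}\, \Big\{ \alpha_n(g^{x_1}_{\psi,n}) - \int_{\mathbb{R}} \alpha_n(g^{x_1}_{\psi,n})\, q_1(x_1)\, dx_1 \Big\}. \tag{5.8}$$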

Lemma 5.3 below enables one to evaluate the respective order of each of the
terms in the right hand side of (5.8). Its proof is postponed until Section 5.2.

Lemma 5.3. Under the conditions of Theorem 3.1, we have, almost surely, with
$g^{x_1}_{\psi,n}$ as in (5.5),

$$\int_{\mathbb{R}} \alpha_n(g^{x_1}_{\psi,n})\, q_1(x_1)\, dx_1 = o\Big(\sup_{x_1 \in C_1} \alpha_n(g^{x_1}_{\psi,n})\Big), \quad \text{as } n \to \infty. \tag{5.9}$$

In view of (5.8) and (5.9), the asymptotic behavior of the left hand side of
(5.8) can be deduced from that of $\alpha_n(g^{x_1}_{\psi,n})$. Then, following once again
the ideas of [15] (see also [!•!]), the proof of Lemma 5.2 will be split into an
upper bound part (captured in Lemma 5.4) and a lower bound part (captured in
Lemma 5.5).

Upper bound part 

Lemma 5.4. Recall the definitions (2.10), (5.4) and (5.5). Under the
assumptions of Theorem 3.1, we have, for all $\varepsilon > 0$, with probability one,

$$\limsup_{n \to \infty} \frac{\sup_{x_1 \in C_1} |\alpha_n(g^{x_1}_{\psi,n})|}{\sqrt{2 h_{1,n} |\log h_{1,n}|}} \le (1 + 2\varepsilon)\, \sigma_{\psi,1}. \tag{5.10}$$

Proof. We will first examine the behavior of the process $\alpha_n(g^{x_1}_{\psi,n})$ on an
appropriately chosen grid of $C_1$ (partitioning). To do so, we will make use of
Bernstein's maximal inequality. Then, we will evaluate the uniform oscillations
of our process between the grid points (evaluation of the oscillations).
Towards this aim, we will make use of an inequality due to Mason [33], recalled
for convenience in Inequality A.1 (see the Appendix).

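Partitioning. Let $a_1$ and $c_1$ be such that $C_1 = [a_1, c_1]$, and fix
$0 < \delta < 1$. From now on, set, for some $\lambda > 1$, $n_k = [\lambda^k]$, for all
$k \ge 1$, and consider the following partitioning of the compact $C_1$,

$$x_{1,j} = a_1 + j\, \delta\, h_{1,n_k}, \quad 0 \le j \le J_k := \Big[\frac{c_1 - a_1}{\delta\, h_{1,n_k}}\Big], \tag{5.11}$$

where $u \le [u] < u + 1$ denotes the integer part of $u$.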

Here, we claim that, for all $\varepsilon > 0$, with probability one,

$$\limsup_{k \to \infty} \frac{\max_{n_{k-1} < n \le n_k} \max_{0 \le j \le J_k} |\sqrt{n}\, \alpha_n(g^{x_{1,j}}_{\psi,n_k})|}{\sqrt{2 n_k h_{1,n_k} |\log h_{1,n_k}|}} \le (1 + \varepsilon)\, \sigma_{\psi,1}. \tag{5.12}$$

For any real valued function $\varphi$ defined on a set $B$, we use the notation
$\|\varphi\|_B = \sup_{x \in B} |\varphi(x)|$, and in the particular case where
$B = \mathbb{R}^m$, for $m \ge 1$, we will write $\|\varphi\| = \|\varphi\|_B$. Recall that $K_\ell$,
$\ell = 1, \ldots, d$, is of bounded variation, and $\psi$ is uniformly bounded. Thus,
under the hypothesis (A), there exists a constant $0 < \kappa < \infty$ such that,
for each $0 \le j \le J_k$ and any $x_1 \in C_1$,

ii^lnji + < (5.13) 

Moreover, by (C.1), and making use of a classical conditioning argument, it
follows from (2.8), (5.5) and (5.6) that, for all $0 \le j \le J_k$, $k \ge 1$,
$1 \le i \le n$,
Var



G^Z,) " '\ hi 



'-1,1 



^l,rifc 



< 



E'' /(u) [Jn"-^ fjl htn^ \ ht^rik J J 

^ K2(^^M^^\du. (5.14) 

V ^l,n, / 

But, by setting h_i = (/i2.nfc, ■ • ■ , hd^uk)^ ^'Hd making use of classical changes of 
variables, it can be derived that, under {K.l) and (Q.l), for a given < 6* < 1, 



d p 



= fJ-ilu-i) +o(l), 
in such a way that 

( [ l[J;^Ke(^^)q,{xe)d^^ =g^^(u_i) + o(l). (5.15) 

Recalling the definition (2.10) of cr^,i, it readily follows, from (5.14) and (5.15), 
that, for all £ > and for n large enough, 

max Var(4;-(X„Z„<5,)) < {\ + £)ol^hi^^,. (5.16) 

In view of (5.4), (5.13) and (5.16), we can apply Bernstein's maximal inequality
(see for instance Lemma 2.2 in [14]) to the sequence of random variables.
This yields, for n large enough,

P| max max |a„(5^|„' )| > cr^,i(l + e) j2/ii^„Jlog/ii,„j| 

I nk^l<n<nk l<3<Jk ^ * J 

--2(Tii(l + £)/ii,„Jlog/ii,„J 



< 2(Jfc + l)cxp 



< 



2(Jk + l)hlZ^\ (5.17) 



Keep in mind the definition (5.11) of $J_k$. Since, under (H.5),
$\sum_{k \ge 1} h_{1,n_k}^{\upsilon} < \infty$ for all $\upsilon > 0$, the result (5.17) combined with
the Borel-Cantelli Lemma naturally implies (5.12).

Evaluation of the oscillations. In the sequel, for any class $\mathcal{G}$ of measurable
functions, we will denote by $\|\alpha_n\|_{\mathcal{G}} = \sup_{g \in \mathcal{G}} |\alpha_n(g)|$, with $\alpha_n$ as in (5.4).

For future use, first consider a slightly wider class of functions than the one 
strictly needed in this paragraph. Namely, set 

Gk = {94,°,,nk ^ 302,"' '^'=-1 < ^ "fe' ^a,Xb e Ci, ipi,ip2 e T}. (5.18) 

Arguing exactly as in pages 17 and 18 of [15], it can be shown that, for all
$k \ge 1$, $\mathcal{G}_k$ is included in a class $\mathcal{G}'$ of measurable functions, which has
a uniform polynomial covering number, i.e., such that for some $C_0 > 0$ and
$\mu > 0$, and all $0 < \varepsilon < 1$, $\mathcal{N}(\varepsilon, \mathcal{G}') \le C_0\, \varepsilon^{-\mu}$. Here
$\mathcal{N}(\varepsilon, \mathcal{G}') := \sup\{\mathcal{N}(\varepsilon, \mathcal{G}', L_2(\mathbb{P})),\ \mathbb{P} \text{ probability measure}\}$
denotes the uniform covering number of the class $\mathcal{G}'$ for $\varepsilon$ and the class
of norms $\{L_2(\mathbb{P})\}$, with $\mathbb{P}$ varying in the set of all probability measures
on $\mathbb{R}^{d+2}$ (for more details, see, e.g., pp. 83-84 in [41]).

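To study the behavior of the process $\alpha_n(g^{x_1}_{\psi,n})$ between the grid points
$x_{1,j}$ and $x_{1,j+1}$, with $0 \le j \le J_k - 1$, we introduce the following class
of functions

$$\mathcal{G}'_{k,j} = \{g^{x_{1,j}}_{\psi,n_k} - g^{x_1}_{\psi,n} : n_{k-1} < n \le n_k,\; x_{1,j} \le x_1 \le x_{1,j+1}\}.$$

Note that, for every $0 \le j \le J_k - 1$, we have
$\mathcal{G}'_{k,j} \subset \mathcal{G}_k \subset \mathcal{G}'$.

Now we claim that, for any $\varepsilon > 0$, there exist almost surely a $\delta_\varepsilon$ and a
$\lambda_\varepsilon$ such that,

$$\limsup_{k \to \infty} \max_{0 \le j \le J_k - 1} \frac{\max_{n_{k-1} < n \le n_k} \|n^{1/2} \alpha_n\|_{\mathcal{G}'_{k,j}}}{\sqrt{2 n_k h_{1,n_k} |\log h_{1,n_k}|}} \le \varepsilon\, \sigma_{\psi,1}, \tag{5.19}$$

whenever (5.11) holds with $0 < \delta \le \delta_\varepsilon$, $1 < \lambda \le \lambda_\varepsilon$ and
$n_k = [\lambda^k]$.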

To establish (5.19), we will make use of Inequality A.1 (see the Appendix).
Towards this aim, first note that, since $K_1$ is of bounded variation, we can
write $K_1 = K_{1,1} - K_{1,2}$ where $K_{1,1}$ and $K_{1,2}$ are two non-decreasing
functions of bounded variation on $\mathbb{R}$. Clearly, $K_{1,1}$ and $K_{1,2}$ are such
that $|K_1|_v = |K_{1,1}|_v + |K_{1,2}|_v$, with $|\cdot|_v$ denoting total variation.
Then, for all $0 \le j \le J_k - 1$ and $x_{1,j} \le x_1 \le x_{1,j+1}$, it follows that



< 



R 



xi-X 



1,1 



hi 



^l,rifc 



x\ - ^1. 



>yl d(/fi,i(y)+i^i,2(y)) 



Since \K\{x) — K\{'y)\ < \Ki\v for all y € H, we get 

2 



E 



xij - Xi^ 



<ll/illcHAi^ 



Now setting, for < j < — 1, 



V hi,n 



xi,j-yhi,„^ 

fi{ui)dui 

xi-yhi,„ 



d{Ki,i{y) + Ki,2iy)) 



^l.rifc 



hi 



(5.20) 



al, = sup V&r{g^,{X,Y,S)), 

and making use of the same arguments as those used to derive (5.16), it is 
readily shown that 



cr^, < hi, 



\\h\\c?\K\l 



hl,n — hi,nt 



Set r = l/[7:>i(l + ^/2/A2)], where L»i and A2 are the constants involved in 
Inequality A.l. By selecting 6 > sufficiently small, and A > 1 close enough to 
1 to make max„^._j<„<„^. \hn — hn^\/hn^ as small as desired for large k (using 
(77.1-2)), we get 



2 ^22 2u 



(5.21) 



Now observe that for all < j < J/c — 1, we have < k uniformly over 

g-il! j ^fe' where k is as in (5.13). Therefore, applying Inequality A.l with 
r as in (5.21) and p = T^j2/A-2 yields 



P 



max„,_i<„<„, \\n^''^an\\g'^ 

max ' 

^'<i<-J>'-^ x/ukhi^n, log(l//ii,„J 



>s\<3Jkhl^^. (5.22) 



Arguing as before, (5.19) now follows under {H.5) from (5.22) and the definition 
(5.11) of Jfc, in combination with the Borel-Cantelli Lemma. 

Conclusion: The proof of Lemma 5.4 is completed by combining (5.12) and
(5.19). □

Lower bound part



Lemma 5.5. Recall the definitions (5.4), (5-5) and (2.10). Under the assump- 
tions of Theorem 3.1, we have, with probability one, 



liminf sup - 

^leCi ^2/ii,„log(l//ii,„) 



> 



(5.23) 



Proof. Recall the definition (2.9), and note that, from Scheffé's Lemma, it
follows under (A) and (C.2-3) that the function



Xi 



'0V^l(a^l) 
fi{xi) 



R 



1/2 



is continuous on Ci (see Section ^.3 in for a complete proof of such continuity 
results). Then, for any e > 0, we can select a sub- interval J = [a', c'] C Ci, such 
that P{Xi G J} < 1/2 and 



inf / 



'^Vn]_(wi) 
h{ui) 



-,1/2 



R 



> crv,,i(l - e/2). 



Now, consider the following partitioning of $J$:

$x_{1,i} = A + 2ih_{1,n}$, for $i = 1, \ldots, [(B - A)/2h_{1,n}] - 1$.

For each $x_{1,i}$, $1 \le i \le k_n$, define the function



(x, y, c) 



•0(y) 
Giy) 




T„(x)i^i 



hi . 



if y < c, 
if y > c. 



where T„ is as in (5.6). Given these notations, the proof of Lemma 5.5 follows 
from the same lines as those used to establish Proposition 3 in [] ■")]. For the sake 
of brevity, we omit the details of these book-keeping arguments. □ 

Combining Lemmas 5.3, 5.4 and 5.5 completes the proof of Lemma 5.2.
Under the conditions of Theorem 3.1, we readily obtain from Lemma 5.2 
that, with probability one, for any finite subclass Q (1 T , 



nhi^n 



sup sup ±{rj^ i{xi) — JSf]^ i{xi)}^^(Ti as n ^ oo. (5.24) 



Y 2| log/li,„| ^gg^^jgCi 

Therefore, to complete the proof of Lemma 5.1, it remains to extend (5.24)
to the entire class $\mathcal{F}$. The following two lemmas are directed towards this
aim.
Lemma 5.6. Assume that the assumptions of Theorem 3.1 hold. For all $\varepsilon > 0$, we
can find a finite subclass C T , such that, for any G J^, and for n large 
enough, 

min sup E{ [g^lJX, Z, S) - g^lJX^ Z, S)f}< eh,,„ 
where, for all ip G T, g.^^^ is as in (5.5). 

Proof. Set $\omega_0 = \omega$ [resp. $\omega_0 = T_H < \infty$] if (A)(i) [resp. (A)(ii)] holds. Under
(A), it follows from (5.5) and (5.6) that, for $\psi_1, \psi_2 \in \mathcal{F}$ and $x_1 \in C_1$,



E{[5-^„(X,Z,<5)-,g-^„(X,Z,5)]^} 



= E 



r„(x)i^i(^^i-^)(V.i(y)-V2(F))] } 



G{Y) 



where /3 = G '^{ujo)\\Ki\\'^ snpf^^ y)^ c° x [0,^.0] {/x.i- (x, 2/)T'„(x)} < 00. Besides, 
since is a VC subgraph class, it is totally bounded with respect to dg, where 
Q is the uniform (0,0^0) distribution. Thus, for any e > 0, we can find a finite 
class such that 



□ 



sup mill / [tpiiy) - ip2{y)] dy<£/p. 

^lgJF1/J2ee, Jq 



Fix $\varepsilon > 0$ and select $n_0 > 0$ so large that (5.25) holds for all $n \ge n_0$. Further
define, for all '01 7 "02 € 

d\^P,,i,2)^ sup sup E{[5;^^_JX,Z,<5)-.9;^„(X,Z,5)]'}. 

Now consider the class of functions 

5n{e) = {5;i,„ -5;i,„,d'(V'i,02) < e C^}. 

Lemma 5.7. Under the assumptions of Theorem 3.1, we have, with probability 
one, 

hmsup ^"P^-(^..V';)<^^"P^.^cJI«nllg„(.) < (5 25) 

n^oo ^2/li_„| log/li,„| 

where A is an absolute constant. 

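Proof. The proof of (5.25) is similar to that of (5.19). Set $n_k = 2^k$ and note
that

$$\max_{n_{k-1} < n \le n_k} \|\alpha_n\|_{\mathcal{G}_n(\varepsilon)} \le \max_{1 \le n \le n_k} \|\alpha_n\|_{\mathcal{G}_k(\varepsilon)},$$

where $\mathcal{G}_k(\varepsilon) = \bigcup_{n = n_{k-1}+1}^{n_k} \mathcal{G}_n(\varepsilon)$. It is straightforward that $\sup_{v \in \mathcal{G}_k(\varepsilon)} \|v\| \le \kappa$, with
$\kappa$ as in (5.13). Moreover, keeping in mind the definition (5.18) of $\mathcal{G}'_k$, we have
$\mathcal{G}_n(\varepsilon) \subset \mathcal{G}'_k$. Next, observe that, for all large $k$, under (H.2),

$$\sup_{v \in \mathcal{G}_k(\varepsilon)} \operatorname{Var}(v(X, Z, \delta)) \le \varepsilon h_{1,n_{k-1}} \le 2\varepsilon h_{1,n_k}.$$

Arguing as before, Inequality A.1, when applied with $\tau = \sqrt{2\varepsilon}$ and $\rho = \tau\sqrt{1/A_2}$,
enables us to complete the proof of Lemma 5.7. □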

Recalling (5.8), the proof of Lemma 5.1 is achieved by combining (5.24) with
the results of Lemmas 5.6 and 5.7. Now, to conclude the proof of Proposition
5.1, it is clearly enough to establish the following result.

Lemma 5.8. Recall the definition (5.2). Under the assumptions of Theorem 
3.1, we have, 

y/nhi^n {E?7^ i(a;i)-7/^,i(xi)} 

sup sup = =0(1) as n — > oo. 

xieCii^eJ^ y/\\oghi^n\ 

Proof. From (5.1), and arguing as in (2.1), it holds that 

6^j{Z) 1 fxi-Xi 

Kf 



G(Z)/(X) /i,,„ 'V he,n 

E(vxr)|x) ' 1 fx,-x, 



Then, by making use of a Taylor expansion of order $s$ (made possible by
assumptions (K.1) and (C.3)), we get

sup sup \Kfh^,n{x) - TO^(x)| = O I TT h'/ \ ■ (5-26) 

By {H.'i), the result of Lemma 5.8 is now a direct consequence of (5.2). □ 
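
The Taylor-expansion argument can be checked numerically in a simple one-dimensional case. The sketch below is an illustration under assumptions that are not in the paper: it uses the Epanechnikov kernel (a second-order kernel), the test function $m = \sin$, and a midpoint rule for the integral. It evaluates the smoothing bias $\int K(v)\{m(x - hv) - m(x)\}\,dv$, which for a second-order kernel should be of order $h^2$.

```python
import math


def epanechnikov(v):
    """Epanechnikov kernel on [-1, 1]; chosen here only as a convenient example."""
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0


def smoothing_bias(m, x, h, grid=2000):
    """Midpoint-rule approximation of the integral of K(v) * (m(x - h*v) - m(x)) over [-1, 1]."""
    step = 2.0 / grid
    total = 0.0
    for j in range(grid):
        v = -1.0 + (j + 0.5) * step  # midpoint of the j-th cell
        total += epanechnikov(v) * (m(x - h * v) - m(x))
    return total * step


if __name__ == "__main__":
    for h in (0.2, 0.1, 0.05):
        print(h, smoothing_bias(math.sin, 0.5, h))
```

Halving $h$ should divide the bias by roughly four in this example; a kernel of order $s$ would be expected to give a factor close to $2^s$ instead.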

5.1.2. Two useful approximation lemmas 

Now, we shall show how to treat the general case (i.e. when neither $f$ nor $G$ is
known). Let m^^„ [resp. r^Tp.i] be the version of "^J^„(x) [resp. 77^ ^] (see (2.2) 
[resp. (2.6)]) in the case where G is known and / is unknown. Namely, we have 



nf^^\G{Z,),U^,)f}^h,,n \ he,. 



, (5.27) 

VipAi^i) = / ?7i,0,„(x)(7_i(x„i)(ix_i - / m^,„(x)(j(x)dx. (5.28) 
Lemma 5.9. Recall the definition (5.2). Under the assumptions of Theorem
3.1, we have, almost surely,



sup sup ^-^ = o(lj as n ^ Qo. 



(5.29) 



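Proof. Because $q(\mathbf{x}) = \prod_{\ell=1}^{d} q_\ell(x_\ell)$ and because the functions $q_\ell$, $\ell = 1, \ldots, d$,
are bounded, a classical change of variable yields, for $i = 1, \ldots, n$,

$$\int \prod_{\ell=1}^{d} \frac{1}{h_{\ell,n}} K_\ell\!\left(\frac{x_\ell - X_{i,\ell}}{h_{\ell,n}}\right) q(\mathbf{x})\, d\mathbf{x} \le M_1,$$

with $0 < M_1 < \infty$. Recall the definition (5.1) and set $\Psi(y, c) = \mathbb{1}_{\{y \le c\}}\psi(y \wedge c)/G(y \wedge c)$,
for all $y, c \in \mathbb{R}$. Then, since, for $\ell = 1, \ldots, d$, $q_\ell$ has a compact
support included in $C_\ell$ and $K_\ell$ is compactly supported, we have under (H.1)
and for $n$ large enough,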



,„(x) - m^,„(x)|g(x)dx < — Q)! sup 



|/»(x)-/(x)| 
c° |/(x)/„(x)| 



Clearly, by (A), is uniformly bounded. Therefore, the following result (see, 
e.g., [1]) 



sup /„(x)-/(x) =0 



'logri 



a.s., as n ^ oo, 



is enough to conclude under (C.4) that, almost surely as $n \to \infty$,

[ 



'\ogn 



|m^,„(x) - m^0,„(x)|(7(x)(ix = O j 
Similarly, it can be shown that, almost surely as $n \to \infty$,

|m^,„(a;i,x_i) - TO^,„(a;i,x_i)|g_i(x_i)dx_i = 0\ 



sup 



I log n 



From these last two statements and the definitions (5.2) and (5.28), the proof
of Lemma 5.9 is completed under the assumption (H.4). □

Lemma 5.10. Recall the definitions (2.6) and (5.28). Under the assumptions
of Theorem 3.1, we have, almost surely as $n \to \infty$,



sup sup — = o(l). 

xieciiiey^ ^|log/ii,„| 



(5.30) 
Proof. First consider the case where (A)(i) holds. Set $b = \inf_{\mathbf{x} \in C^{\alpha}} f(\mathbf{x})$. Note
that $b > 0$ by (C.4). Then, recalling (2.2) and (5.27) and arguing as we did
along the proof of Lemma 5.9, we get

/ N ~ /M.N, Ml flV'(y)l • |G*(y)-G(y)n 

R'i b + o{l) o<y<ujl \G*{y)G(y)\ J 

Since $\omega < T_H$, the law of the iterated logarithm of [18] ensures that

$$\sup_{y \le \omega} |G_n^*(y) - G(y)| = O\big((\log\log n/n)^{1/2}\big)$$

almost surely as $n \to \infty$. Therefore, it follows under (C.2-3-4) that, almost
surely as $n \to \infty$,



|m;.„(x)-™^,„(x)|g(x)dx = o( Ji^i^ |. (5.31) 
In the same spirit, it can be shown that, almost surely as $n \to \infty$,



sup / \ml „(x) - m^,n(x)|g_i(x_i)rfx_i = \ . r'^^^^^n \ ^ ^^ ^2) 

which, by (H.5), completes the proof of Lemma 5.10 under (A)(i). In the case
where (A)(ii) holds, the proof follows along the same lines as above, making use
of either the law of the iterated logarithm of [20] (if (A)(ii) holds with $p = 1/2$)
or Theorem 2.1 of [6] (if (A)(ii) holds with $0 < p < 1/2$) instead of the law of
the iterated logarithm of [18]. The details are omitted. □

By combining Lemmas 5.9 and 5.10 with Proposition 5.1, we conclude the 
proof of Theorem 3.1. 
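
To make the role of the estimated censoring distribution concrete, here is a minimal Python sketch of the I.P.C.W. transformation $\delta_i\psi(Z_i)/G(Z_i)$ with $G$ replaced by a Kaplan-Meier estimate. It is an illustration only: the function names are hypothetical, $G$ is taken to be the left-continuous survival function $t \mapsto P(C \ge t)$ of the censoring variable, and the paper's own estimator of $G$ may follow different conventions. The simulated exponential lifetimes and censoring times are likewise assumptions made for the example.

```python
import math
import random


def censoring_survival_left(z, delta):
    """Kaplan-Meier estimate of G(t-) = P(C >= t), evaluated at each observed z[i].

    Censoring times play the role of events here, so the product runs over
    observations with delta == 0 that are strictly smaller than z[i].
    """
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    g_left = [1.0] * n
    surv = 1.0  # product over distinct times strictly below the current time
    k = 0
    while k < n:
        t = z[order[k]]
        m = k
        while m < n and z[order[m]] == t:  # group tied observations
            m += 1
        at_risk = n - k
        censored = sum(1 - delta[order[j]] for j in range(k, m))
        for j in range(k, m):
            g_left[order[j]] = surv
        surv *= 1.0 - censored / at_risk
        k = m
    return g_left


def ipcw_responses(z, delta, psi=lambda y: y):
    """Synthetic responses delta_i * psi(Z_i) / G_n(Z_i-); censored points get 0."""
    g_left = censoring_survival_left(z, delta)
    return [d * psi(zi) / g if d else 0.0 for zi, d, g in zip(z, delta, g_left)]


def simulate(n, seed=0):
    """Illustrative data: Y ~ Exp(1) and C ~ Exp(0.5), independent."""
    rng = random.Random(seed)
    ys = [rng.expovariate(1.0) for _ in range(n)]
    cs = [rng.expovariate(0.5) for _ in range(n)]
    z = [min(y, c) for y, c in zip(ys, cs)]
    delta = [1 if y <= c else 0 for y, c in zip(ys, cs)]
    return z, delta


if __name__ == "__main__":
    z, delta = simulate(5000)
    indicator = lambda y: 1.0 if y <= 1.0 else 0.0
    estimate = sum(ipcw_responses(z, delta, indicator)) / len(z)
    print(estimate, 1.0 - math.exp(-1.0))  # estimate versus the target P(Y <= 1)
```

Using the left limit $G_n(Z_i-)$ keeps every weight strictly positive for uncensored observations, which avoids a division by zero when the largest observation is censored.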

5.2. Proof of Lemma 5.3
Set 

$$\Psi(Y_i, C_i) = \mathbb{1}_{\{Y_i \le C_i\}}\psi(Y_i \wedge C_i)/G(Y_i \wedge C_i) = \delta_i\psi(Z_i)/G(Z_i),$$

$$g(x_1) = E\big(\Psi(Y_i, C_i) \mid X_{i,1} = x_1\big), \qquad (5.33)$$
and (3i{xi) = — — 2^ ^ . KA — 
It follows that,
Var(/3i(xi)) 



1 " 

„2 



hi,' 



.fl{Xi,l)hi^n 



— $i,„(xi) - [ri,„(xi)]2 



Observing that g is uniformly bounded under the assumptions of Theorem 3.1 
and making use of some conditioning arguments, it can be shown that 



1 



rin(^i)^0 as n^Qo. 
ti ' 

Moreover, by using the change of variable $v_1 h_{1,n} = x_1 - u_1$, we obtain

KKvi) 



(5.34) 



fl{xi - hi,nVi) 



h{xi) 



dvi 



Kfivi)dvi. 



fiixi) 

But, recalling the definition (2.9) of the function (t>t(,,i., the quantity 
E(#(y,,C,;)| ^Xi- /ll,„fl) <t)4,,i{xi) 



Mxi) 



is clearly bounded under the assumptions (C.4), (C.6), (K.1) and (Q.1). Therefore,
Lebesgue's dominated convergence theorem enables us to conclude that



0'<A, 1(2^1) 



Kfivi)dvi. 



h{xi) 

Combining (5.34) and (5.35) we obtain, for all $x_1 \in C_1$,

Var(/3i(xi)) = E{A(xi) - E/3i(.ti)}' = ©(n-^^/^^.+i)) ^ 

Then, 



(5.35) 



Var(/3i(a;i))(ia;i 



Ci 



= / E{/3i(xi)-E/3i(xi)}'dxi 
= e(^^ {/3i(xi)-E/3i(xi)}'dx-i 



00 



-2fc/(2fc+l)^ 



(5.36) 
Recall the definitions (5.1), (5.4), (5.5), (5.6) and (5.33). From (5.7), and using
the Cauchy-Schwarz inequality, we obtain



^2/ii,„| log/ii,, 



nhi . 



2|log(/ii,„)| 



nhi . 



< 



2|log(/ii,„)| 



y 2|log(/ii,„)| ^ 

From (5.36) and (5.37), it follows that, almost surely as $n \to \infty$,



/r ^/2/ii,„| log/ii,, 
But, from (5.23), we have 

sup 



qi{xi)dx^ 
{m^,„(x) - ETO^,„(x)}g(x)(ix 
{f3i{xi) - El3i{xi)}qi{xi)dxi 
{/3i(a;i)-E/3i(xi)}'dxi/ (72(3,^)^3,^. (5.37) 



(5.38) 



qi{xi)dxi 



a, 



iGCi \/2hi^n\ log/li 



> 



/ni/(2^+i)/ii^ 
I log^i,,i| 



(5.39) 



The proof of Lemma 5.3 is readily achieved by combining (5.38) and (5.39) with
the condition (H.3). □

5.3. Proof of Theorem 3.2 

Recall the definitions (2.6) and (2.7) and observe that. 



2| log/ii^„| .^e^xec 



sup sup±{m^^^rf^(x) - m^(x)} - ^ cr^ 



d 

^ E 

1=1 



nhi \ 



sup sup ±{^^^t{xi) - rf^j^yxi)} - ai 



2| log/li^„| ^^Txif^Ct 

{"^V-^nW ~ '71^.(x)}q(x)c!x 



nhin 



Y 2| log/ii,, 

Under the assumptions (H.3) and (H.4), by proceeding as we did along the proof
of Lemma 5.3 (see also (5.26) and (5.37)), we get



nhi_ 



2\\oghi, 



{m^, „(x) — m{x)}q{x)dic = o(l) a.s. 



(5.40) 



By combining Theorem 3.1 and the statement (5.40), we complete the proof of 
Theorem 3.2. 



M. Debbarh and V. Viallon/ Additive regression with right censored data 






Appendix 

In this section, we present an inequality which was of particular interest for our 
task. It is due to Mason [33], who derived it from an inequality obtained by 
Talagrand [39]. 

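Inequality A.1. Let $Z, Z_1, \ldots, Z_n$ be i.i.d. random variables, $n \ge 1$. Denote
by $\mathcal{F}$ a class of functions such that

$$\sup_{f \in \mathcal{F}} \operatorname{Var}(f(Z)) \le \tau^2 h, \quad \text{with } \tau, h > 0.$$

Assume there exist constants $M$, $C$ and $\nu > 0$, fulfilling, for all $0 < \varepsilon < 1$,

$$\mathcal{N}(\varepsilon, \mathcal{F}) \le C\varepsilon^{-\nu} \quad \text{and} \quad \sup_{f \in \mathcal{F}} |f(z)| \le M.$$

Choose any $\rho > 0$. Then there exist a universal constant $A_2 > 0$, and a constant
$D_1 = D_1(\nu) > 0$, depending only on $\nu$, such that, if $h > 0$ satisfies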



K^ := max < , — — > < 

r 



then we have, with = XiJ^i {g{Zj) - E{g{Z))} for g £ J^, 



P( sup |1T,„(-)|1^> {t + p)Di^nh\logh\) <2cxp( -^|log/i| 
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